DATA 304 Homework 5

Author

Lily McAboy

Exercise 1

Copy of Graphic:

r/DataIsBeautiful Source: r/dataisbeautiful

What marks are being used? What variables are mapped to which properties?

The marks are lines. Each “Worst Day” event is mapped to color. The full set of mappings is:

time after occurrence (years) -> x
percentage change -> y
Worst Day Occurrence -> color

What is the main story of this graphic?

The main story of the graphic is that, of three of the “worst days” for the stock market between 1987 and 2020, 2008’s Great Recession took almost 4 years to recover to its state before the drop. 1987’s Black Monday took approximately 2 years, and COVID-19’s crash recovered in less than a year.

What makes it a good graphic?

I really like that the x axis is not a datetime. Instead, it counts the number of years until recovery. I think this makes the most sense for comparing recovery times, because with a datetime on the x axis the lines would be hard to compare, since the worst days are multiple years apart. I also like that the colors contrast clearly with each other.

What features do you think you would know how to implement in Vega-Lite?

I would create a line graph, layering each of the “worst days”, and then define cyan, pink, and yellow as the colors for the three days. I would use “background”: “black” to change the background color. If the data stored dates rather than elapsed time, I would have to calculate the difference between each event’s first date and the date in each row, so that the lines can be laid on top of each other on a shared x axis.

I would also have to add a layer of text for the lower right text that explains the graphs.
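A rough sketch of the elapsed-time idea, under stated assumptions: the field names (event, crash_date, date, pct_change) and the four inline rows are invented placeholders, not the graphic's real data. It uses a single color encoding rather than separate layers, which is one possible alternative to layering, and converts the date difference from milliseconds to years with a calculate transform.

```r
library(vegawidget)  # provides as_vegaspec()

# Placeholder rows only: events "A" and "B", their dates, and the percentages
# are made up so the sketch renders. They are not real market data.
'{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "background": "black",
  "config": {
    "axis": {"labelColor": "white", "titleColor": "white"},
    "legend": {"labelColor": "white", "titleColor": "white"}
  },
  "data": {
    "values": [
      {"event": "A", "crash_date": "2000-01-01", "date": "2000-01-01", "pct_change": -20},
      {"event": "A", "crash_date": "2000-01-01", "date": "2001-01-01", "pct_change": -5},
      {"event": "B", "crash_date": "2010-01-01", "date": "2010-01-01", "pct_change": -30},
      {"event": "B", "crash_date": "2010-01-01", "date": "2012-01-01", "pct_change": -10}
    ]
  },
  "transform": [
    {
      "calculate": "(time(datum.date) - time(datum.crash_date)) / (365.25 * 24 * 60 * 60 * 1000)",
      "as": "years_since"
    }
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "years_since", "type": "quantitative", "title": "Years after occurrence"},
    "y": {"field": "pct_change", "type": "quantitative", "title": "Percentage change"},
    "color": {
      "field": "event",
      "type": "nominal",
      "title": "Worst day",
      "scale": {"range": ["cyan", "pink", "yellow"]}
    }
  }
}' |> as_vegaspec()
```

Because every event starts at zero on the computed years_since field, the lines overlap on one axis even though the underlying dates are years apart.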

Are there any features of the graphic that you would not know how to do in Vega-Lite? If so, list them.

I am not sure how to add the “extra” lines on the x axis. Most of my graphics just use Vega-Lite’s default formatting. I would also have to learn how to cut off the lines so that they do not exceed 0% on the y axis.
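For the cutoff at 0%, one option that may work is to fix the y scale's domain and set clip on the mark, so that anything drawn outside the plotting area is hidden. The sketch below uses three made-up points (fields t and pct are hypothetical) and is only a starting point. The tickCount and grid axis properties are included as a guess at what the “extra” lines might be, if they are ticks or gridlines.

```r
library(vegawidget)  # provides as_vegaspec()

# Made-up points: the last one (15) lies above the 0% ceiling and is clipped.
'{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"t": 0, "pct": -30},
      {"t": 1, "pct": -10},
      {"t": 2, "pct": 15}
    ]
  },
  "mark": {"type": "line", "clip": true},
  "encoding": {
    "x": {
      "field": "t",
      "type": "quantitative",
      "title": "Years after occurrence",
      "axis": {"tickCount": 10, "grid": true}
    },
    "y": {
      "field": "pct",
      "type": "quantitative",
      "title": "Percentage change",
      "scale": {"domain": [-40, 0]}
    }
  }
}' |> as_vegaspec()
```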

Exercise 2

Visualizing Weather 1

I am having issues with my graphs. The first graphic is the one to be graded; I am attaching the rest to try to debug them or figure them out.

Create a graphic that shows the mean temperature for each month. How many “months” should you be displaying? (There is more than one answer to this – perhaps try doing it more than one way.)

Way 1

In this graphic, I am showing the mean max temperature per month. Color is used as a guide to show the year for a specific month.

'
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
  "url": "https://calvin-data304.netlify.app/data/weather-with-dates.csv"
},
"height": 400,
"width": 500,
"title": "Mean Max Temperature Per Month",
"mark": "circle" ,
"encoding": {
  "x": {"field": "month", "type": "temporal", "timeUnit": "month","title": "Month"},
  "y": {"field": "temp_max", "type": "quantitative", "aggregate": "average", "title": "Temperature (F)"},
  "color": {"field": "year", "title": "Year"},
  "fillOpacity": {"value": 0.8},
  "size": {"value": 80}
}
}' |> as_vegaspec()

Way 2

Using R to calculate the actual mean temperature for each day, and then aggregating by month in Vega-Lite.

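library(jsonlite)    # provides toJSON()
library(vegawidget)  # provides as_vegaspec()

seattle_data <- read.csv("https://calvin-data304.netlify.app/data/weather-with-dates.csv")
# Daily mean is the midpoint of the daily max and min temperatures
seattle_data$mean_daily_temp <- (seattle_data$temp_max + seattle_data$temp_min)/2
# Serialize the data frame as a JSON array of row objects for inline "values"
seattle_weather_json <- toJSON(seattle_data, pretty = TRUE)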
seattle_mean_temp <- paste0(
'{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {"values": ', seattle_weather_json, '},
  "height": 400,
  "width": 500,
  "title": "Mean Temperature Per Month",
  "mark": "circle",
  "encoding": {
    "x": {"field": "month", "type": "ordinal", "title": "Month"},
    "y": {"field": "mean_daily_temp", "type": "quantitative", "aggregate": "average", "title": "Temperature (F)"},
    "color": {"field": "year", "title": "Year"},
    "fillOpacity": {"value": 0.8},
    "size": {"value": 80}
  }
}'
) |> as_vegaspec()

Note: I cannot get this to render and can’t find a solution online. From what I found out, it may be my package versions or the R version that I have that doesn’t allow this to work. Sometimes it works and sometimes it doesn’t.
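A possible variation, sketched under assumptions rather than tested against the course setup: do the monthly aggregation in R as well, so that only one row per month and year is embedded in the spec instead of every daily row. It assumes the CSV has month and year columns, as the specs in this exercise do. It uses "ordinal" for the month axis, since Vega-Lite's field types are quantitative, temporal, ordinal, nominal, and geojson.

```r
library(jsonlite)    # provides toJSON()
library(vegawidget)  # provides as_vegaspec()

seattle_data <- read.csv("https://calvin-data304.netlify.app/data/weather-with-dates.csv")
seattle_data$mean_daily_temp <- (seattle_data$temp_max + seattle_data$temp_min) / 2

# Assumption: `month` and `year` columns exist in the CSV.
# One row per (month, year) holding the mean of the daily means.
monthly_means <- aggregate(
  mean_daily_temp ~ month + year,
  data = seattle_data,
  FUN = mean
)

paste0(
'{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {"values": ', toJSON(monthly_means), '},
  "height": 400,
  "width": 500,
  "title": "Mean Temperature Per Month",
  "mark": "circle",
  "encoding": {
    "x": {"field": "month", "type": "ordinal", "title": "Month"},
    "y": {"field": "mean_daily_temp", "type": "quantitative", "title": "Temperature (F)"},
    "color": {"field": "year", "type": "nominal", "title": "Year"}
  }
}'
) |> as_vegaspec()
```

Since the data is already aggregated, the y encoding needs no "aggregate" here. Keeping the embedded JSON small may also make rendering problems easier to track down.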

'
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "url": "https://calvin-data304.netlify.app/data/weather-with-dates.csv"
  },
  "transform": [
    {
      "calculate": "(datum.temp_max + datum.temp_min) / 2",
      "as": "daily_mean"
    },
    {
      "timeUnit": "month",
      "field": "date",
      "as": "month"
    },
    {
      "aggregate": [
        {
          "op": "average",
          "field": "daily_mean",
          "as": "mean_of_means"
        }
      ],
      "groupby": ["month", "year"]
    }
  ],
  "height": 400,
  "width": 500,
  "title": "Mean of Mean Temperatures Per Month",
  "mark": "circle",
  "encoding": {
    "x": {
      "field": "month",
      "type": "temporal",
      "timeUnit": "month",
      "title": "Month"
    },
    "y": {
      "field": "mean_of_means",
      "type": "quantitative",
      "title": "Mean of Daily Means (F)"
    },
    "color": {
      "field": "year",
      "title": "Year"
    },
    "fillOpacity": {
      "value": 0.8
    },
    "size": {
      "value": 80
    }
  }
}' |> as_vegaspec()

I can’t understand why there is an issue with this either. I thought I was calculating the mean daily temp and then aggregating that by month the same way I aggregated with the temp_max.
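One way to isolate the problem might be to try the calculate-then-aggregate steps on a few inline rows. The sketch below is a guess at a debugging aid, not a diagnosis: the four rows are made up, and the month and year fields are plain numbers so no date parsing is involved. An aggregate transform outputs only the groupby fields and the aggregated fields, so year is listed in groupby to keep it available for the color encoding.

```r
library(vegawidget)  # provides as_vegaspec()

# Made-up rows for checking the transform logic; not the Seattle data.
'{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      {"year": 2012, "month": 1, "temp_max": 50, "temp_min": 40},
      {"year": 2012, "month": 1, "temp_max": 54, "temp_min": 44},
      {"year": 2013, "month": 1, "temp_max": 48, "temp_min": 36},
      {"year": 2013, "month": 2, "temp_max": 52, "temp_min": 42}
    ]
  },
  "transform": [
    {"calculate": "(datum.temp_max + datum.temp_min) / 2", "as": "daily_mean"},
    {
      "aggregate": [
        {"op": "average", "field": "daily_mean", "as": "mean_of_means"}
      ],
      "groupby": ["month", "year"]
    }
  ],
  "mark": "circle",
  "encoding": {
    "x": {"field": "month", "type": "ordinal", "title": "Month"},
    "y": {"field": "mean_of_means", "type": "quantitative", "title": "Mean of Daily Means (F)"},
    "color": {"field": "year", "type": "nominal", "title": "Year"}
  }
}' |> as_vegaspec()
```

With these rows the chart should show three circles, one per month and year combination. If that works, the same groupby pattern can be tried against the full dataset.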

Visualizing Weather 2

Exercise 3 Create a graphic that shows how the different types of weather (rain, fog, etc.) are distributed by month in Seattle. When is it rainiest in Seattle? Sunniest?

'{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
  "url": "https://calvin-data304.netlify.app/data/weather-with-dates.csv"
},
"height": 400,
"width": 500,
"title": "Seattle Weather Type Distribution (2012-2015)",
"mark": "bar" ,
"encoding": {
  "x": {"field": "month", "type": "temporal", "timeUnit": "month","title": "Month"},
  "y": {"field": "weather", "type": "quantitative", "aggregate": "count","title": "Count of Days"},
  "color": {"field": "weather", "title": "Weather Type"},
  "xOffset": {"field": "weather"}
}
}' |> as_vegaspec()

'{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {
  "url": "https://calvin-data304.netlify.app/data/weather-with-dates.csv"
},
"height": 200,
"width": 175,
"title": "Seattle Weather Distribution (2012-2015) Faceted By Weather Type",
"mark": "bar" ,
"encoding": {
  "x": {"field": "month", "type": "temporal", "timeUnit": "month","title": "Month"},
  "y": {"field": "weather", "type": "quantitative", "aggregate": "count","title": "Count of Days"},
  "color": {"field": "weather", "title": "Weather Type"},
  "xOffset": {"field": "weather"},
  "facet": {"field": "weather", "columns": 3,  "title": ""}
}
}' |> as_vegaspec()